End-to-end attention-based distant speech recognition with Highway LSTM
نویسنده
چکیده
End-to-end attention-based models have been shown to be competitive alternatives to conventional DNN-HMM models in the Speech Recognition Systems. In this paper, we extend existing end-to-end attentionbased models that can be applied for Distant Speech Recognition (DSR) task. Specifically, we propose an end-to-end attention-based speech recognizer with multichannel input that performs sequence prediction directly at the character level. To gain a better performance, we also incorporate Highway long short-term memory (HLSTM) which outperforms previous models on AMI distant speech recognition task.
منابع مشابه
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB) as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIBchr('39')s archive is one of the richest archives in Iran containing a huge amount of multimedia data. Monitoring this massive volume of data, and brows and retrieval of this archive is one of the key issues for this broadcasting...
متن کاملResidual LSTM: Design of a Deep Recurrent Architecture for Distant Speech Recognition
In this paper, a novel architecture for a deep recurrent neural network, residual LSTM is introduced. A plain LSTM has an internal memory cell that can learn long term dependencies of sequential data. It also provides a temporal shortcut path to avoid vanishing or exploding gradients in the temporal domain. The residual LSTM provides an additional spatial shortcut path from lower layers for eff...
متن کاملLSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition
Popularized by the long short-term memory (LSTM), multiplicative gates have become a standard means to design artificial neural networks with intentionally organized information flow. Notable examples of such architectures include gated recurrent units (GRU) and highway networks. In this work, we first focus on the evaluation of each of the classical gated architectures for language modeling fo...
متن کاملAttention-Based LSTM with Multi-Task Learning for Distant Speech Recognition
Distant speech recognition is a highly challenging task due to background noise, reverberation, and speech overlap. Recently, there has been an increasing focus on attention mechanism. In this paper, we explore the attention mechanism embedded within the long short-term memory (LSTM) based acoustic model for large vocabulary distant speech recognition, trained using speech recorded from a singl...
متن کاملAttention-Based Models for Text-Dependent Speaker Verification
Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning due to their ability to summarize relevant information that expands through the entire length of an input sequence. In this paper, we analyze the usage of attention mechanisms to the problem of sequence summarization in our end-to-end textdepend...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1610.05361 شماره
صفحات -
تاریخ انتشار 2016